7 research outputs found

    Service Level Agreement-based GDPR Compliance and Security assurance in (multi)Cloud-based systems

    Compliance with the new European General Data Protection Regulation (Regulation (EU) 2016/679) and security assurance are currently two major challenges of Cloud-based systems. GDPR compliance implies the definition, enforcement and control of both privacy and security mechanisms, including evidence collection. This paper presents a novel DevOps framework aimed at supporting Cloud consumers in designing, deploying and operating (multi)Cloud systems that include the necessary privacy and security controls for ensuring transparency to end-users, third parties in service provision (if any) and law enforcement authorities. The framework relies on the risk-driven specification, at design time, of privacy and security level objectives in the system Service Level Agreement (SLA) and on their continuous monitoring and enforcement at runtime. The research leading to these results has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreements No 644429 and No 780351 (the MUSA and ENACT projects, respectively). We would also like to acknowledge all the members of the MUSA Consortium and ENACT Consortium for their valuable help.

    CrowdWON: A Modelling Language for Crowd Processes based on Workflow Nets

    Although crowdsourcing has proven to be an efficient mechanism for solving independent tasks in on-line production, it is still unclear how to define and manage workflows for complex tasks that require the participation and coordination of different workers. Despite the existence of different frameworks to define workflows, we still lack a commonly accepted solution that is able to describe the most common workflows in current and future platforms. In this paper, we propose CrowdWON, a new graphical framework to describe and monitor crowd processes. The proposed language is able to represent the workflow of most well-known existing applications, extend previous modelling frameworks, and assist in the future generation of crowdsourcing platforms. Beyond previous proposals, CrowdWON allows for the formal definition of adaptive workflows that depend on the skills of the crowd workers and/or process deadlines. CrowdWON also allows expressing constraints on workers based on previous individual contributions. Finally, we show how our proposal can be used to describe well-known crowdsourcing workflows.

    A Model-based Approach to Realize Privacy and Data Protection by Design

    Telecommunications and data are pervasive in almost every aspect of our everyday life, and new concerns progressively arise as a result of the stakes related to privacy and data protection. Indeed, systems development is becoming data-centric, leading to an ecosystem where a variety of players intervene (citizens, industry, regulators) and where the policies regarding data usage are far from consensual. The General Data Protection Regulation (GDPR), which has applied in the European Union since 2018, has introduced new provisions, including principles of lawfulness, fairness and transparency, thus endowing data subjects with new rights with regard to their personal data. In this context, there is a growing need for approaches that conceptualize GDPR and privacy provisions and help engineers integrate them at design time. This paper presents a comprehensive approach to support different phases of the design process, with special attention to the integration of privacy and data protection principles. Among other features, it is a generic model-based approach that can be specialized according to the specifics of different application domains.

    Using Machine Learning to Optimize Parallelism in Big Data Applications

    In-memory cluster computing platforms have gained momentum in recent years, due to their ability to analyse large amounts of data in parallel. These platforms are complex and difficult-to-manage environments. In addition, there is a lack of tools to better understand and optimize such platforms, which consequently form the backbone of big data infrastructure and technologies. This directly leads to underutilization of available resources and application failures in such environments. One of the key aspects that can address this problem is optimization of the task parallelism of applications in such environments. In this paper, we propose a machine learning based method that recommends optimal parameters for task parallelization in big data workloads. By monitoring and gathering metrics at system and application level, we are able to find statistical correlations that allow us to characterize and predict the effect of different parallelism settings on performance. These predictions are used to recommend an optimal configuration to users before launching their workloads in the cluster, avoiding possible failures, performance degradation and wastage of resources. We evaluate our method with a benchmark of 15 Spark applications on the Grid5000 testbed. We observe up to a 51% gain in performance when using the recommended parallelism settings. The model is also interpretable and can give insights to the user into how different metrics and parameters affect the performance.

    ParallelGDB: A Parallel Graph Database Based on Cache Specialization

    The need for managing massive attributed graphs is becoming common in many areas such as recommendation systems, proteomics analysis, social network analysis or bibliographic analysis. This is making it necessary to move towards parallel systems that allow managing graph databases containing millions of vertices and edges. Previous work on distributed graph databases has focused on finding ways to partition the graph to reduce network traffic and improve execution time. However, partitioning a graph and keeping the information regarding the location of vertices might be unrealistic for massive graphs. In this paper, we propose ParallelGDB, a new system based on specializing the local cache of each node in the system, providing a better cache hit ratio. ParallelGDB uses random graph partitioning, avoiding complex partitioning methods based on the graph topology, which usually require managing extra data structures. The proposed system provides an efficient environment for distributed graph databases.
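The abstract does not say how ParallelGDB implements its random partitioning. As a minimal sketch of the general idea, assuming hash-based placement (an assumption for illustration, not the authors' method), a stable hash of the vertex identifier can stand in for a random assignment. Any node can then recompute where a vertex lives without keeping a location table. The function names `assign_partition` and `partition_vertices` are hypothetical.

```python
import zlib


def assign_partition(vertex_id, num_partitions):
    """Map a vertex id to a partition index using a stable hash.

    zlib.crc32 is used instead of the built-in hash() because hash() of
    strings is salted per Python process, so placements would not be
    reproducible across machines.
    """
    digest = zlib.crc32(str(vertex_id).encode("utf-8"))
    return digest % num_partitions


def partition_vertices(vertex_ids, num_partitions):
    """Group vertex ids into num_partitions lists by hashed placement."""
    partitions = [[] for _ in range(num_partitions)]
    for vertex_id in vertex_ids:
        partitions[assign_partition(vertex_id, num_partitions)].append(vertex_id)
    return partitions


# Illustrative data only: 1000 synthetic vertex ids spread over 4 nodes.
vertex_ids = [f"v{i}" for i in range(1000)]
partitions = partition_vertices(vertex_ids, 4)
```

Placement of this kind ignores graph topology entirely, which is the trade-off the abstract describes: no extra data structures to maintain, at the cost of not minimising edges that cross partitions.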

    Graph partitioning strategies for efficient BFS in shared-nothing parallel systems

    Traversing massive graphs as efficiently as possible is essential for many applications. Many common operations on graphs, such as calculating the distance between two nodes, are based on the Breadth First Search (BFS) traversal. However, because of the exhaustive exploration of all the nodes and edges of the graph, this operation might be very time consuming. A possible solution is distributing the graph among the nodes of a shared-nothing parallel system. Nevertheless, this operation may generate a large amount of inter-node communication. In this paper, we propose two graph partitioning techniques and improve previous distributed versions of BFS in order to reduce this communication.
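To make the two ideas in this abstract concrete, here is a minimal single-machine sketch: a textbook BFS that computes the distance between two vertices, and a helper that counts edges whose endpoints are owned by different nodes, a rough proxy for the inter-node communication a distributed BFS would incur. It does not reproduce the partitioning techniques proposed in the paper; the graph, the `owner` mapping and the function names are hypothetical.

```python
from collections import deque


def bfs_distance(adjacency, source, target):
    """Return the number of edges on a shortest path, or None if unreachable."""
    if source == target:
        return 0
    distance = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbour in adjacency.get(vertex, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[vertex] + 1
                if neighbour == target:
                    return distance[neighbour]
                queue.append(neighbour)
    return None


def count_cross_partition_edges(adjacency, owner):
    """Count adjacency entries whose two endpoints live on different nodes."""
    return sum(
        1
        for vertex, neighbours in adjacency.items()
        for neighbour in neighbours
        if owner[vertex] != owner[neighbour]
    )


# Illustrative undirected graph stored as adjacency lists (each edge appears twice).
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
# Hypothetical assignment of vertices to two machines.
owner = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}

print(bfs_distance(graph, "a", "e"))              # 3
print(count_cross_partition_edges(graph, owner))  # 4
```

Every adjacency entry that crosses a partition boundary corresponds to a vertex whose expansion must be requested from another machine, so a partitioning that lowers this count should also lower the communication the abstract refers to.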